Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

search for function* #3114

Merged
merged 3 commits into from
Mar 3, 2021
Merged

search for function* #3114

merged 3 commits into from
Mar 3, 2021

Conversation

peterbe
Copy link
Contributor

@peterbe peterbe commented Mar 3, 2021

Fixes #3113

Can't argue with the results:
Screen Shot 2021-03-03 at 11 08 56 AM

The reason the Using Fetch document turns up there is because that page has such a high popularity that simply matching the start of the word gives it a tiny matchness and combined with its popularity it comes up pretty high.
But the most important thing is the first search result which is what people expect.

for key in keys:
print(f"{key:{longest_key + 1}} {token[key]!r}")
print()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just needed the above fixed for my local debugging because after I've run the indexing I run many different variables of

poetry run deployer search-analyze "function*"

@peterbe peterbe requested a review from escattone March 3, 2021 16:48
@peterbe
Copy link
Contributor Author

peterbe commented Mar 3, 2021

Note-to-self; Once this lands, remember to run a Dev build because that's what many of us are using for our local kuma docker-compose when we don't want to run a local elasticsearch server.

# is a "risk" worth accepting because event `*emphasis*` would get converted
# to `emphasisstar` which is more accurate than letting the standard tokenizer
# turn it into `emphasis` alone.
pattern="([\\w]+)-?\\*",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit. It seems this could be simplified to:

Suggested change
pattern="([\\w]+)-?\\*",
pattern="(\\w+)-?\\*",

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! I think it's because I used to have [a-z] in there but that implies all lowercase. And Function* is a real word, so I later changed it to \w and forgot to remove the [ and ].
Good catch!
Also, \w means [a-zA-Z_0-9] which I'm not sure is right. But it's hard to tell.
Currently, based on looking at every single en-US document title I think the regex is sane. That _ and 0-9 stuff might make a difference within the texts of the body but as long as the first result is what people expect, I'm not too worried.

@peterbe peterbe merged commit e0fee91 into mdn:main Mar 3, 2021
@peterbe peterbe deleted the 3113-search-for-function branch March 3, 2021 20:09
peterbe added a commit to peterbe/yari that referenced this pull request Jun 1, 2021
* search for function*

Fixes mdn#3113

* feedbacked
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Search for function* should find the page about function*
2 participants